Using Moldability to Improve the Performance of Supercomputer Jobs
Authors
Abstract
In most parallel supercomputers, submitting a job for execution involves specifying (i) how many processors are to be allocated to the job, and (ii) for how long these processors are to be available to the job. Since most jobs are moldable (i.e., there is a choice of how many processors the job uses), the user typically has to decide how many processors to request for a given job, and estimate the request time accordingly. In this paper, we show that the request used to submit a moldable job can be selected automatically in a way that often reduces the job’s turn-around time. More precisely, we introduce and evaluate SA, an application scheduler that chooses, on behalf of the user, which request to use to submit a moldable job. The user provides SA with a set of possible requests that can be used to submit a given moldable job. SA estimates the turn-around time of each request based on the current state of the supercomputer, and then forwards to the supercomputer the request with the smallest expected turn-around time. The conditions under which SA is studied cover variations in the characteristics of the job, the state of the supercomputer, and the information available to SA. The results show that SA often improves the turn-around time of the job under a variety of conditions.
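The selection step described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the names `Request`, `estimated_start`, and `select_request` are hypothetical, and the wait-time estimate is a deliberately simple assumption (it looks only at when running jobs release processors and ignores other queued jobs and the scheduler's policy). It shows only the general idea of estimating a turn-around time for each candidate request and choosing the smallest.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    """One candidate way to submit a moldable job (hypothetical type)."""
    processors: int
    request_time: float  # requested wall-clock time, in seconds


def estimated_start(processors, free_now, releases):
    """Earliest time at which `processors` are free.

    Assumption: `releases` is a list of (time, count) pairs saying that
    `count` processors become free at `time`; queued jobs are ignored.
    """
    available = free_now
    if available >= processors:
        return 0.0
    for time, count in sorted(releases):
        available += count
        if available >= processors:
            return float(time)
    raise ValueError("request needs more processors than will ever be free")


def select_request(requests, free_now, releases):
    """Return the request with the smallest estimated turn-around time,
    taken here as estimated wait plus requested execution time."""
    return min(
        requests,
        key=lambda r: estimated_start(r.processors, free_now, releases)
        + r.request_time,
    )


# Illustrative numbers only: 4 processors free now, 4 more at t=100,
# and 8 more at t=900.
candidates = [Request(4, 1000.0), Request(8, 600.0), Request(16, 350.0)]
best = select_request(candidates, free_now=4, releases=[(100, 4), (900, 8)])
print(best)
```

With these made-up numbers, the 8-processor request wins: it waits longer than the 4-processor request but finishes sooner overall, while the 16-processor request runs fastest but waits too long to start.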
Similar resources
Using Moldability to Improve the Performance of Supercomputer Jobs PhD Thesis
Distributed-memory parallel supercomputers are an important platform for the execution of high-performance parallel jobs. In order to submit a job for execution in most supercomputers, one has to specify the number of processors to be allocated to the job. However, most parallel jobs in production today are moldable. A job is moldable when the number of processors it needs to execute can vary, ...
Using Moldability to Improve Scheduling Performance of Parallel Jobs on Computational Grid
In a computational grid environment, a common practice is to try to allocate an entire parallel job onto a single participating site. Sometimes a parallel job, upon its submission, cannot fit in any single site due to the occupation of some resources by running jobs. How the job scheduler handles such situations is an important issue which has the potential to further improve the utilization of gr...
HPC Usage Behavior Analysis and Performance Estimation with Machine Learning Techniques
Most researchers with little high performance computing (HPC) experience have difficulties productively using the supercomputing resources. To address this issue, we investigated usage behaviors of the world’s fastest academic Kraken supercomputer, and built a knowledge-based recommendation system to improve user productivity. Six clustering techniques, along with three cluster validation measu...
A Model for Moldable Supercomputer Jobs
The performance of supercomputer schedulers is influenced by the workloads that serve as their input. Realistic workloads are therefore critical to evaluate how supercomputer schedulers perform in practice. There has been much written in the literature about rigid parallel jobs, i.e., jobs that require partitions of a fixed size to run. However, the majority of the parallel jobs in production tod...
Modeling, Evaluating, and Improving the Performance of Supercomputer Scheduling
The most popular scheduling policy for parallel systems is FCFS with backfilling (a.k.a. “EASY” scheduling), where short jobs are allowed to run ahead of their time provided they do not delay previously queued jobs (or at least the first queued job). This mandates users to provide estimates of how long jobs will run, and jobs that violate these estimates are killed so as not to violate subseque...
Journal title:
- J. Parallel Distrib. Comput.
Volume 62, Issue
Pages -
Publication date: 2002